Scribbles for All: Benchmarking Scribble Supervised Segmentation Across Datasets
In this work, we introduce Scribbles for All, a label and training data generation algorithm for semantic segmentation trained on scribble labels. Training or fine-tuning semantic segmentation models with weak supervision has recently become an important topic and has seen significant advances in model quality. In this setting, scribbles are a promising label type: they achieve high-quality segmentation results while requiring much lower annotation effort than the usual pixel-wise dense semantic segmentation annotations. The main limitation of scribbles as a source of weak supervision is the lack of challenging datasets for scribble segmentation, which hinders the development of novel methods and conclusive evaluations. To overcome this limitation, Scribbles for All provides scribble labels for several popular segmentation datasets and an algorithm to automatically generate scribble labels for any dataset with dense annotations, paving the way for new insights and model advancements in the field of weakly supervised segmentation. In addition to providing the datasets and the algorithm, we evaluate state-of-the-art segmentation models on our datasets and show that models trained with our synthetic labels perform competitively with models trained on manual labels. Thus, our datasets enable state-of-the-art research into methods for scribble-labeled semantic segmentation.
On Deepfake Voice Detection -- It's All in the Presentation
Delgado, Héctor, Ramondetti, Giorgio, Dalmasso, Emanuele, Karvitsky, Gennady, Colibro, Daniele, Talib, Haydar
While the technologies empowering malicious audio deepfakes have evolved dramatically in recent years due to advances in generative AI, the same cannot be said of global research into spoofing (deepfake) countermeasures. This paper highlights how current deepfake datasets and research methodologies have led to systems that fail to generalize to real-world applications. The main reason is the difference between raw deepfake audio and deepfake audio that has been presented through a communication channel, e.g. by phone. We propose a new framework for data creation and research methodology, allowing for the development of spoofing countermeasures that would be more effective in real-world scenarios. By following the guidelines outlined here, we improved deepfake detection accuracy by 39% in more robust and realistic lab setups, and by 57% on a real-world benchmark. We also demonstrate that improvements in datasets would have a bigger impact on deepfake detection accuracy than choosing larger SOTA models over smaller ones; that is, it would be more important for the scientific community to invest in comprehensive data collection programs than to simply train larger models with higher computational demands.
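To make the distinction between raw and "presented" audio concrete, the following is a minimal sketch of what passing a signal through a phone-like channel might look like. It is not the data creation framework of the paper: the halving of the sample rate, the 8-bit mu-law round trip, and all function names are assumptions chosen for illustration only.

```python
import math

MU = 255  # companding constant for 8-bit mu-law (an assumed, telephone-style codec)


def mu_law_encode(x: float) -> int:
    """Compress a sample in [-1, 1] to an 8-bit code (0..255)."""
    x = max(-1.0, min(1.0, x))
    y = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
    return int(round((y + 1.0) / 2.0 * MU))


def mu_law_decode(code: int) -> float:
    """Expand an 8-bit code back to an approximate sample in [-1, 1]."""
    y = 2.0 * code / MU - 1.0
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def simulate_phone_channel(samples):
    """Hypothetical channel: halve the sample rate, then apply a mu-law round trip."""
    # Averaging adjacent pairs is a crude low-pass filter; keeping one value
    # per pair halves the sample rate (e.g. 16 kHz -> 8 kHz).
    halved = [(samples[i] + samples[i + 1]) / 2.0
              for i in range(0, len(samples) - 1, 2)]
    # Quantization through the codec discards fine detail of the waveform.
    return [mu_law_decode(mu_law_encode(s)) for s in halved]


# One second of a 440 Hz tone at an assumed 16 kHz sample rate.
raw = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
presented = simulate_phone_channel(raw)
```

A detector trained only on signals like `raw` never sees the bandwidth loss and quantization noise present in `presented`, which is one plausible source of the generalization gap described above.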
A Appendix
Macros are relatively large, including DRAMs, caches, and IO interfaces. Pins are the input/output interfaces of modules and are connected directly by wires. A net contains a set of pins connected by the same wires. Pins from the same net form a net bounding box, as shown in Fig. 8. The half-perimeter wirelength (HPWL) is the sum of the half perimeters of the net bounding boxes, as in Fig. 8 (a)(b). We give a set of placement results to explain the metrics in Fig. 8. The density of Fig. 8 (c) is 2.0. Relationship between pin offset and HPWL: the pin offset can affect the HPWL.
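The HPWL definition and the effect of pin offsets can be sketched in a few lines. This is an illustrative example under stated assumptions, not code from the original work: the coordinates, the net name, and the helper names (`pin_position`, `net_hpwl`, `total_hpwl`) are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def pin_position(module_origin: Point, pin_offset: Point) -> Point:
    """Absolute pin location = module position + pin offset inside the module."""
    return (module_origin[0] + pin_offset[0], module_origin[1] + pin_offset[1])


def net_hpwl(pins: List[Point]) -> float:
    """Half perimeter of the bounding box enclosing all pins of one net."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_hpwl(nets: Dict[str, List[Point]]) -> float:
    """HPWL of a placement: sum of half perimeters over all net bounding boxes."""
    return sum(net_hpwl(pins) for pins in nets.values() if pins)


# Two modules joined by one net; only the pin offset on module A changes.
module_a, module_b = (0.0, 0.0), (4.0, 3.0)

nets_no_offset = {"n1": [pin_position(module_a, (0.0, 0.0)),
                         pin_position(module_b, (0.0, 0.0))]}
nets_with_offset = {"n1": [pin_position(module_a, (1.0, 1.0)),
                           pin_position(module_b, (0.0, 0.0))]}

hpwl_no_offset = total_hpwl(nets_no_offset)      # bounding box 4 x 3
hpwl_with_offset = total_hpwl(nets_with_offset)  # bounding box 3 x 2
```

With the module positions held fixed, moving the pin on module A by an offset of (1, 1) shrinks the net bounding box, so the HPWL drops from 7 to 5. This is the sense in which the pin offset affects the HPWL.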
Empowering Morphing Attack Detection using Interpretable Image-Text Foundation Model
Patwardhan, Sushrut, Ramachandra, Raghavendra, Venkatesh, Sushma
Morphing attack detection has become an essential component of face recognition systems for ensuring a reliable verification scenario. In this paper, we present a multimodal learning approach that can provide a textual description of morphing attack detection. We first show that zero-shot evaluation of the proposed framework using Contrastive Language-Image Pretraining (CLIP) can not only yield generalizable morphing attack detection, but also predict the most relevant text snippet. We present an extensive analysis of ten different textual prompts, both short and long. These prompts are engineered with human-understandable text snippets in mind. Extensive experiments were performed on a face morphing dataset that was developed using a publicly available face biometric dataset. We present an evaluation of SOTA pre-trained neural networks together with the proposed framework in the zero-shot evaluation of five different morphing generation techniques that are captured in three different mediums.
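The zero-shot step can be illustrated with the scoring rule that CLIP-style models use: cosine similarity between an image embedding and each prompt embedding, followed by a softmax. The sketch below is an assumption-laden illustration, not the proposed framework: the two prompts and the three-dimensional toy embeddings are hypothetical stand-ins for real encoder outputs, and the logit scale of 100 is an assumed value.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_scores(image_emb, prompt_embs, logit_scale=100.0):
    """Softmax over scaled image-text similarities, one probability per prompt."""
    logits = {p: logit_scale * cosine(image_emb, e) for p, e in prompt_embs.items()}
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {p: math.exp(v - top) for p, v in logits.items()}
    total = sum(exps.values())
    return {p: v / total for p, v in exps.items()}


# Hypothetical prompts and toy embeddings; real ones would come from the
# text and image encoders of a pre-trained model.
prompt_embs = {
    "a photo of a bona fide face": [1.0, 0.0, 0.0],
    "a photo of a morphed face": [0.0, 1.0, 0.0],
}
image_emb = [0.9, 0.1, 0.0]

scores = zero_shot_scores(image_emb, prompt_embs)
best_prompt = max(scores, key=scores.get)
```

The highest-scoring prompt serves both as the decision (bona fide or morph) and as the human-readable text snippet attached to it, which is how a single scoring pass can yield a detection result and a description at once.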
High-Quality Pseudo-Label Generation Based on Visual Prompt Assisted Cloud Model Update
Xu, Xinrun, Zhang, Qiuhong, Yang, Jianwen, Lian, Zhanbiao, Yan, Jin, Ding, Zhiming, Jiang, Shan
Generating high-quality pseudo-labels on the cloud side is crucial for cloud-edge collaborative object detection, especially in dynamic traffic monitoring scenarios where the target data distribution continuously evolves. Existing methods often assume a perfectly reliable cloud model, neglecting the potential for errors in the cloud's predictions, or employ simple adaptation techniques that struggle to handle complex distribution shifts. This paper proposes a novel Cloud-Adaptive High-Quality Pseudo-label generation algorithm (CA-HQP) that addresses these limitations by incorporating a learnable Visual Prompt Generator (VPG) and a dual feature alignment strategy into the cloud model updating process. The VPG enables parameter-efficient adaptation of the large pre-trained cloud model by injecting task-specific visual prompts into the model's input, enhancing its flexibility without extensive fine-tuning. To mitigate domain discrepancies, CA-HQP introduces two complementary feature alignment techniques: a global Domain Query Feature Alignment (DQFA) that captures scene-level distribution shifts and a fine-grained Temporal Instance-Aware Feature Embedding Alignment (TIAFA) that addresses instance-level variations. Extensive experiments on the Bellevue traffic dataset, a challenging real-world traffic monitoring dataset, demonstrate that CA-HQP significantly improves the quality of pseudo-labels compared to existing state-of-the-art cloud-edge collaborative object detection methods. Further ablation studies validate the contribution of each individual component (DQFA, TIAFA, VPG) and confirm the synergistic effect of combining global and instance-level feature alignment strategies.
SynthRAD2025 Grand Challenge dataset: generating synthetic CTs for radiotherapy
Thummerer, Adrian, van der Bijl, Erik, Galapon, Arthur Jr, Kamp, Florian, Savenije, Mark, Muijs, Christina, Aluwini, Shafak, Steenbakkers, Roel J. H. M., Beuel, Stephanie, Intven, Martijn P. W., Langendijk, Johannes A., Both, Stefan, Corradini, Stefanie, Rogowski, Viktor, Terpstra, Maarten, Wahl, Niklas, Kurz, Christopher, Landry, Guillaume, Maspero, Matteo
Medical imaging is essential in modern radiotherapy, supporting diagnosis, treatment planning, and monitoring. Synthetic imaging, particularly synthetic computed tomography (sCT), is gaining traction in radiotherapy. The SynthRAD2025 dataset and Grand Challenge promote advancements in sCT generation by providing a benchmarking platform for algorithms using cone-beam CT (CBCT) and magnetic resonance imaging (MRI). The dataset includes 2362 cases: 890 MRI-CT and 1472 CBCT-CT pairs from head-and-neck, thoracic, and abdominal cancer patients treated at five European university medical centers (UMC Groningen, UMC Utrecht, Radboud UMC, LMU University Hospital Munich, and University Hospital of Cologne). Data were acquired with diverse scanners and protocols. Pre-processing, including rigid and deformable image registration, ensures high-quality, modality-aligned images. Extensive quality assurance validates image consistency and usability. All imaging data is provided in MetaImage (.mha) format, ensuring compatibility with medical image processing tools. Metadata, including acquisition parameters and registration details, is available in structured CSV files. To maintain dataset integrity, SynthRAD2025 is divided into training (65%), validation (10%), and test (25%) sets. The dataset is accessible at https://doi.org/10.5281/zenodo.14918089 under the SynthRAD2025 collection. This dataset supports benchmarking and the development of synthetic imaging techniques for radiotherapy applications. Use cases include sCT generation for MRI-only and MR-guided photon/proton therapy, CBCT-based dose calculations, and adaptive radiotherapy workflows. By integrating diverse acquisition settings, SynthRAD2025 fosters robust, generalizable image synthesis algorithms, advancing personalized cancer care and adaptive radiotherapy.
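The case counts and split percentages above can be checked with a little arithmetic. The sketch below computes nominal set sizes only; the actual per-set counts are not stated here and may differ, for example if the split is made per center or per anatomical region. The function name and the largest-remainder rounding rule are assumptions for illustration.

```python
def nominal_split_sizes(n_cases, percents):
    """Split n_cases by integer percentages using largest-remainder rounding."""
    # Floor each share first, using integer arithmetic to avoid float error.
    sizes = {name: n_cases * pct // 100 for name, pct in percents.items()}
    leftover = n_cases - sum(sizes.values())
    # Hand the leftover cases to the sets with the largest fractional parts.
    by_remainder = sorted(percents,
                          key=lambda name: (n_cases * percents[name]) % 100,
                          reverse=True)
    for name in by_remainder[:leftover]:
        sizes[name] += 1
    return sizes


n_mri_ct, n_cbct_ct = 890, 1472   # pair counts quoted in the abstract
total_cases = n_mri_ct + n_cbct_ct  # should match the stated 2362 cases

sizes = nominal_split_sizes(
    total_cases, {"training": 65, "validation": 10, "test": 25}
)
```

The two pair counts do sum to the stated 2362 cases, and the nominal sizes add back up to the same total, so the figures quoted above are internally consistent.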